MT Tuning on RED: A Dependency-Based Evaluation Metric

Authors

  • Liangyou Li
  • Hui Yu
  • Qun Liu
Abstract

In this paper, we describe our submission to the WMT 2015 Tuning Task. We integrate a dependency-based MT evaluation metric, RED, into Moses and compare it with BLEU and METEOR in conjunction with two tuning methods: MERT and MIRA. Experiments are conducted using hierarchical phrase-based models on Czech–English and English–Czech tasks. Our results show that MIRA performs better than MERT in most cases, and that tuning towards RED performs similarly to tuning towards METEOR when MIRA is used. We submit the system tuned by MIRA towards RED to WMT 2015. In the human evaluation, we rank 1st among all 7 systems on the English–Czech task and 6th of 9 on the Czech–English task.
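The abstract describes RED only at a high level. As a purely illustrative sketch (not the actual RED metric, whose details are in the cited paper), a dependency-based score can be thought of as an F-measure over the (head, relation, dependent) triples shared by the reference and hypothesis parses; the function and example sentences below are hypothetical:

```python
# Toy dependency-overlap score in the spirit of dependency-based MT metrics.
# NOT the actual RED metric: just an F-measure over matched
# (head, relation, dependent) triples from pre-parsed sentences.

from collections import Counter

def dep_triple_f1(ref_triples, hyp_triples):
    """F1 over dependency triples shared by reference and hypothesis."""
    ref_counts = Counter(ref_triples)
    hyp_counts = Counter(hyp_triples)
    # Multiset intersection counts each triple at most min(ref, hyp) times.
    matched = sum((ref_counts & hyp_counts).values())
    if matched == 0:
        return 0.0
    precision = matched / len(hyp_triples)
    recall = matched / len(ref_triples)
    return 2 * precision * recall / (precision + recall)

# Reference: "John saw the dog"  /  Hypothesis: "John saw the cat"
ref = [("saw", "nsubj", "John"), ("saw", "obj", "dog"), ("dog", "det", "the")]
hyp = [("saw", "nsubj", "John"), ("saw", "obj", "cat"), ("cat", "det", "the")]
print(round(dep_triple_f1(ref, hyp), 3))  # only one triple matches → 0.333
```

A tuner such as MERT or MIRA would then optimize the decoder's feature weights against a sentence- or corpus-level score of this kind instead of BLEU.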


Similar References

RED: A Reference Dependency Based MT Evaluation Metric

Most widely-used automatic evaluation metrics consider only local fragments of the references and translations, ignoring evaluation at the syntax level. Current syntax-based evaluation metrics try to introduce syntactic information but suffer from the poor parsing results of noisy machine translations. To alleviate this problem, we propose a novel dependency-based evaluati...


A New Syntactic Metric for Evaluation of Machine Translation

Machine translation (MT) evaluation aims at measuring the quality of a candidate translation by comparing it with a reference translation. This comparison can be performed on multiple levels: lexical, syntactic or semantic. In this paper, we propose a new syntactic metric for MT evaluation based on the comparison of the dependency structures of the reference and the candidate translations. The ...


A Customizable MT Evaluation Metric for Assessing Adequacy Machine Translation Term Project

This project describes a customizable MT evaluation metric that provides system-dependent scores for the purposes of tuning an MT system. The features presented focus on assessing adequacy over fluency. Rather than simply examining features, this project frames the MT evaluation task as a classification question to determine whether a given sentence was produced by a human or a machine. Support Ve...


PORT: a Precision-Order-Recall MT Evaluation Metric for Tuning

Many machine translation (MT) evaluation metrics have been shown to correlate better with human judgment than BLEU. In principle, tuning on these metrics should yield better systems than tuning on BLEU. However, due to issues such as speed, requirements for linguistic resources, and optimization difficulty, they have not been widely adopted for tuning. This paper presents PORT, a new MT eval...


Feasibility of Minimum Error Rate Training with a Human-Based Automatic Evaluation Metric

Minimum error rate training (MERT) involves choosing parameter values for a machine translation (MT) system that maximize performance on a tuning set as measured by an automatic evaluation metric, such as BLEU. The method is best when the system will eventually be evaluated using the same metric, but in reality, most MT evaluations have a human-based component. Although performing MERT with a h...



Journal:

Volume   Issue

Pages  -

Publication date: 2015